DETAILED ACTION

1. This action is in response to the amendment filed on: 12/28/2006.

2. Claims 1, 14, 18, 32, 38 are amended. Claims 13, 17, 33, 37, 39, and 40 are 
cancelled. Thus, claims 1-12, 14-16, 18-32, 34-36, 38, 41, and 42 are pending. 

3. The prior 35 USC 112 rejection of claim 14 is withdrawn in view of the amendment.

4. Prior rejections for claims 1-12, 14-16, 18-32, 34-36, 38, 41, and 42 are 
withdrawn in view of amendment made by applicant. 

Claim Rejections - 35 USC § 103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 1-4, 6-12, 14, 38, and 42 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Tang et al (US Patent: 6,636,849 B1, issued: Oct. 21, 2003, filed: 
Nov. 22, 2000), Shanahan et al (US Application: US 2003/0033288 A1, published: Feb. 
13, 2003, filed: Dec. 5, 2001), and Beeferman et al (US Patent: 6,701,309 B1, issued: 
Mar. 2, 2004, filed: Apr. 21, 2000) in further view of Birman et al (US Patent: 6,616,704
B1, issued: Sep. 9, 2003, filed: Sep. 20, 2000).

With regards to claim 1, Tang et al teaches a system that facilitates spell checking
comprising: 

• A component that receives input data containing text (column 4, lines 55-66: 
whereas a search string is received) 
• A spell checking component that identifies potentially misspelled strings in the
text, and proposes at least one alternate spelling for the string (column 7,
lines 20-30: whereas, Tang et al's system teaches spell checking
potentially misspelled words using a dictionary/lexicon, and returning a
suggestion to the user concerning at least one alternate spelling.)
However, Tang et al does not teach creating/using substrings of the text, and 
providing an alternate spelling for the substring set, based on at least one query log; the 
query log comprising data utilized by users to query a data collection over a time frame, 
the spell checking component 

Shanahan et al teaches a spell checking system (paragraph 0518: whereas, 
Shanahan et al's system takes text, and identifies text that need spelling corrections). 
The spell checking system takes substring data from the input text (Fig 31: whereas, 
text from a document is processed by tokenizing words, and identifying N-Gram of 
words from the input text after removal of stop words). The iteration where alternative
spellings for potentially misspelled words that are not stop words in the same substring
are identified: all words (substrings of the input text) are iteratively processed and
corrected to generate a set of alternate spellings for the input text, as shown in Fig. 51.
The iterative process where stop words in the substring are identified (paragraph 0365).
Furthermore, stop-word-sequence-skipping counts are implemented, to further refine
the spell checking process (paragraph 0305: whereas, in expert mode, only entities that
occur in referenced documents with a (tracked/logged) frequency below a predefined
threshold are annotated (for the purpose of detecting and ignoring/skipping stop words
(for which the stop words have a count above a predefined threshold))).

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al's spell checking component to further take and 
process and provide alternate spellings for substrings of input text (through various 
techniques, including logged/tracked stop word skipping counts) as taught by Shanahan 
et al. The combination of Tang et al and Shanahan et al would have allowed Tang et 
al's system to have "identified errors in a document, by formulating a query using 
identified errors in document content, identifying a set of entities in the database of 
entities that satisfies the query; correcting the document content using the identified set 
of entities, and updating the information space with the corrected document content" 
(paragraph 0015). 

However, although Tang et al and Shanahan et al teach the implementation of a 
spell checking component for alternate spelling of a substring set, through various 
techniques including the use of tracked/logged data (stop-word-skipping-counts), as 
explained above, they do not expressly teach the alternate spelling of a substring set is 
based on at least one query log; the query log comprising data utilized by users to query 
a data collection over a time frame, and an iterative process where alternative spellings 
for potentially misspelled stop words in the substring are identified in an iteration after 
the iteration where alternative spellings for potentially misspelled words that are not stop 
words in the same substring are identified. 

Beeferman et al teaches that an alternate spelling of a string query (column 1, lines
51-54: whereas, a system for query refinement includes suggesting an alternate spelling
or a corrected spelling for a query) is based on data stored in a query log file (columns
10 and 11, lines 52-64 and 1-10 respectively: whereas, based on heuristic data from
query log data, it is determined if a suggested spelling is appropriate), the query log
comprising data utilized by users to query a data collection over a time frame (Table 2,
column 9, lines 45-67, and column 10, lines 1-6: whereas a query log holds data about
the number of times each particular query/string has been submitted by a
particular class of users in searches, over a period of time), and substring occurrence
and co-occurrence statistics from the at least one query log: in Table 2 (whereas, the
query log comprises the number of occurrences for each particular query (query
substring pair) requested. Furthermore, the number of occurrences are shown to be
greater than one, and thus, co-occurrence counts for each of the particular queries are
also recorded).
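
The role of per-query occurrence counts in choosing an alternate spelling can be illustrated with a minimal sketch. This is not code from Beeferman et al; the sample log, the function names occurrence_counts and best_alternative, and the rule of picking the most frequently submitted candidate are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical query log: one entry per query submitted by a user over some time frame.
query_log = [
    "britney spears",
    "britney spears",
    "britny spears",
    "brittany spears",
    "britney spears",
]


def occurrence_counts(log):
    """Count how many times each distinct query string appears in the log."""
    return Counter(log)


def best_alternative(candidates, counts):
    """Return the candidate spelling submitted most often, or None if none was ever seen."""
    seen = [c for c in candidates if counts[c] > 0]
    return max(seen, key=lambda c: counts[c]) if seen else None


counts = occurrence_counts(query_log)
suggestion = best_alternative(["britny spears", "britney spears"], counts)
print(suggestion)
```

In this sketch a candidate that never appears in the log is not suggested at all, which is one simple way a log can act as the basis for judging whether a suggested spelling is appropriate.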

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al and Shanahan et al's spell checking component,
which spell checks using substrings (and logged/tracked word-skipping-count data),
to further combine the logged/tracked data with the heuristic query log data that is taught
by Beeferman et al. The combination of Tang et al, Shanahan et al, and Beeferman et
al would have allowed Tang et al's system to implement a
spell checking system that would have "refined a presentation of an alternative query to
a first query based on a searcher's tendency to utilize information" (column 2, lines 27-30),
and to have also "collected related queries that have a likelihood of being
submitted by a class of searcher" (column 2, lines 24-26).

However, although Tang et al, Shanahan et al, and Beeferman et al teach the spell
checking component, and also a first set of words (non-stop words), and a second set of
words (stop-words), they do not teach an iterative process where
alternative spellings for potentially misspelled stop words in the substring are identified 
in an iteration after the iteration where alternative spellings for potentially misspelled 
words that are not stop words in the same substring are identified. 

Birman et al teaches an iterative process where alternative spellings for
potentially misspelled stop words in the substring are identified in an iteration after the
iteration where alternative spellings for potentially misspelled words that are not stop
words in the same substring are identified (column 1, lines 30-46: whereas the "stop
words" are secondary candidate words that are ignored in the first spell check phase
(since the first spell check phase performs a spell check on a primary set (non-stop
words) of candidate words that are not eliminated from consideration). The secondary 
candidate words (stop-words), are then brought back into consideration for a more 
thorough spell check scan). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al, Shanahan et al, and Beeferman et al's spell 
checking component, such that a secondary group of words (stop words) are spell 
checked after a primary group of words (non-stop words) have been spell checked, as 
taught by Birman et al. The combination of Tang et al, Shanahan et al, Beeferman et al 
and Birman et al would have allowed Tang et al to have implemented a "very fast
method for correcting the spelling of a word or phrase in a document" (Birman et al, 
column 1, lines 6-11). 
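
The ordering at issue (non-stop words checked in a first iteration, stop words in a later one) can be sketched as follows. This is an illustrative sketch only, not the method of Birman et al or of the claims; the word lists, the helper names closest and two_phase_correct, and the use of Python's difflib similarity matching are assumptions.

```python
import difflib

STOP_WORDS = {"the", "of", "and", "a", "in"}      # hypothetical stop-word list
LEXICON = {"history", "rome", "empire", "fall"}   # hypothetical trusted lexicon


def closest(word, vocabulary):
    """Return the most similar vocabulary entry, or the word unchanged if none is close."""
    matches = difflib.get_close_matches(word, sorted(vocabulary), n=1, cutoff=0.6)
    return matches[0] if matches else word


def two_phase_correct(words):
    """Correct unknown words against the lexicon first, then against the stop words."""
    result = list(words)
    # First iteration: stop words are left out; unknown words are compared
    # with the lexicon of non-stop words only.
    for i, w in enumerate(result):
        if w not in LEXICON and w not in STOP_WORDS:
            result[i] = closest(w, LEXICON)
    # Later iteration: words that are still unknown are treated as
    # potentially misspelled stop words.
    for i, w in enumerate(result):
        if w not in LEXICON and w not in STOP_WORDS:
            result[i] = closest(w, STOP_WORDS)
    return result


print(two_phase_correct(["teh", "histry", "of", "rome"]))
```

Here "histry" is resolved in the first pass, while "teh" has no close lexicon entry and is only resolved in the second pass, when the stop words are brought back into consideration.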

With regards to claim 2, which depends on claim 1, a spell checking
component that further utilizes user-dependent information in proposing at least one
alternative spelling is similarly taught by Tang et al, Shanahan et al, Beeferman et al,
and Birman et al, in claim 1, and is rejected under the same rationale.

With regards to claim 3, which depends on claim 1, Tang et al teaches the 
alternative spelling for the substring set is further based on at least one trusted lexicon 
with content (column 7, lines 23-29: whereas, a dictionary which comprises the correct 
spelling of words, is used as a basis for providing an alternative spelling). 

With regards to claim 4, which depends on claim 3, Tang et al teaches the spell 
checking component, in claim 1, and is rejected under the same rationale. However
Tang et al does not teach the spell checking component further employs a list of stop 
words. 

Shanahan et al teaches a list of stop words with content (paragraph 0365: 
whereas, a set/list of stop words are used to normalize input text data for contextual 
classification). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al's spell checker such that the input text data will be
normalized by removing stop words as taught by Shanahan et al. The combination would
have allowed Tang et al's spell checker to have been able to remove stop words "that
do not improve the quality of classification" (paragraph 0365).

With regards to claim 6, which depends on claim 4, Tang et al teaches a spell 
checking component, in claim 1, and is rejected under the same rationale. Furthermore, 
Tang et al teaches an iterative process to search a space of alternative spellings (Fig 6,
column 12, lines 1-30: whereas, processing for exact or inexact matches is performed
on a search tree start, and iterations or loops take place until the last level of a search
tree is reached (looping occurs from reference numbers 630 to 670 and then back to
630 in Fig 6)).

With regards to claim 7, which depends on claim 6, Tang et al teaches a spell 
checking component, in claim 1, and is rejected under the same rationale. Furthermore, 
Tang et al teaches, at least in part, heuristics to impose restrictions on a search space
utilized to determine a proposed alternative spelling (column 15, lines 60-67, and
column 16, lines 1-5: whereas heuristic methods are used to impose restrictions on a
search space by calculating a distance score that is used in determining candidates for
alternative spellings).

With regards to claim 8, which depends on claim 7, Tang et al teaches the 
heuristics in claim 7, and is rejected under the same rationale. Furthermore, Tang et al 
teaches the heuristics utilize, at least in part, at least one fringe to limit the search space 
(column 9, lines 59-67, and column 10, lines 1-5: whereas, several fringes are 
implemented to limit the search space, such as the probabilistic distance function 
having to be positive, and a triangle inequality has to be satisfied). 

With regards to claim 9, which depends on claim 4, Tang et al, Shanahan et al,
Beeferman et al, and Birman et al similarly teach the query log comprising a histogram
of queries asked over a time frame, as explained in claim 1, and is rejected under the
same rationale.

With regards to claim 10, which depends on claim 9, Tang et al, Shanahan et al, 
Beeferman et al, and Birman et al teach the histogram of queries, as explained in claim 
9, and is rejected under the same rationale. Tang et al, Shanahan et al, Beeferman et 
al, and Birman et al also teach the histogram of queries relates to a subset/class of the 
users, as explained in claim 1, and is rejected under the same rationale. Furthermore, 
the subset/class comprises at least one user (Beeferman et al, column 5, lines 7-9:
whereas a particular class of searchers represents a subset of users with at least one 
searcher/user). 

With regards to claim 11, which depends on claim 9, Tang et al, Shanahan et al,
Beeferman et al, and Birman et al teach a query log, as explained in claim 1, and is 
rejected under the same rationale. Furthermore, Beeferman et al teaches the query log 
resides on a server computer (column 5, lines 39-40: whereas the query log is 
downloaded from a search engine/server computer to the client computer, and thus the 
query log originally resides on the server computer). 

With regards to claim 12, which depends on claim 9, Tang et al, Shanahan et al, 
Beeferman et al, and Birman et al teach a query log, as explained in claim 1, and is
rejected under the same rationale. Furthermore, as explained in claim 11, the query log
is downloaded from the server to the client, and thus the query log resides on the client
computer as well. 

With regards to claim 14, which depends on claim 1, Tang et al, Shanahan et al,
Beeferman et al, and Birman et al teach a substring comprising at least one selected
from a group consisting of an entry in at least one lexicon, as explained in claim 3,
and is rejected under the same rationale.

With regards to claim 38, Tang et al, Shanahan et al, Beeferman et al, and Birman et 
al, similarly teach a system comprising: 

• Means for receiving input data containing text, as described in claim 1, and is
rejected under the same rationale. 

• Means for identifying a set of potentially misspelled substrings in the text and 
proposing at least one alternative spelling for the substring set based on at least 
one query log; the query log comprising data utilized by users to query a data 
collection over a time frame, as described in claim 1, and is rejected under the 
same rationale. 

• The means employing an iterative process that identifies alternative spellings for 
potentially misspelled stop words in the substrings in an iteration after an iteration 
identifies alternative spellings for potentially misspelled words that are not stop 
words in the same substrings, as similarly explained in the rejection for claim 1,
and is rejected under the same rationale. 

With regards to claim 42, Tang et al, Shanahan et al, Beeferman et al, and 
Birman et al similarly teach a device employing the system of claim 1, comprising at
least one of a computer, a server, and a handheld electronic device, as explained in
claim 1, and is rejected under the same rationale.

6. Claims 15, 18, and 20-24 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Tang et al (US Patent: 6,636,849
B1, issued: Oct. 21, 2003, filed: Nov. 22, 2000), Shanahan et al (US Application: US 
2003/0033288 A1, published: Feb. 13, 2003, filed: Dec. 5, 2001), and Beeferman et al 
(US Patent: 6,701,309 B1, issued: Mar. 2, 2004, filed: Apr. 21, 2000) and Birman et al 
(US Patent: 6,616,704 B1, issued: Sep. 9, 2003, filed: Sep. 20, 2000), in further view of 
Hitachi (Derwent, published: Feb 16, 2001, Abstract). 

With regards to claim 15, which depends on claim 1, although Tang et al, 
Shanahan et al, Beeferman et al, and Birman et al teach query log data, as explained in 
the rejection for claim 1 above, they do not expressly teach substring bigram comprising 
a pair of substrings in a text. 

Hitachi however teaches a substring bigram comprising a pair of substrings in a 
text (Abstract: whereas, a collecting unit collects substrings / bigram strings from a 
document, and a counter counts the occurrence(s) for the pair of bi-grams). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al, Shanahan et al, Beeferman et al, and Birman et 
al's query log data/statistics such that they also include bigram statistics/counts from a 
pair of substrings in a text/document, as taught by Hitachi. The combination of Tang et 
al, Shanahan et al, Beeferman et al, and Hitachi, would have allowed Tang et al's spell 
checking component to have been able to evaluate each pair of bigrams "in order of 
degree of importance" (Hitachi, Abstract). 

With regards to claim 18, which depends on claim 1, Tang et al, Shanahan et al,
Beeferman et al, and Birman et al teach substring co-occurrence statistics from the 
query log, as similarly explained in claim 1, and is rejected under the same rationale. 
Furthermore, the query information is stored in a single data structure/log by 
downloading from a server as explained in claim 11, and is rejected under the same 
rationale. 

With regards to claim 20, which depends on claim 18, Tang et al and Shanahan 
et al teach a spell checking system handling split substrings by splitting input text into 
an N-Gram set of words, as explained in claim 1, and is rejected under the same 
rationale. Furthermore, Shanahan et al teaches a method for using heuristics to
determine word similarity, which does not differ (operates in the same manner) whether the
input text is an individual string or an N-Word split substring, using the searching
technique that was explained in claim 6 above.

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al and Shanahan et al's substring spell checking
component, to further utilize the string (individual or substring) independent method for 
searching a search space, as also taught by Shanahan et al. The combination of Tang 
et al, Shanahan et al, Beeferman et al, Birman et al, and Hitachi would have allowed 
Tang et al's spell checking component to have been able to have provided for expanded 
search results if needed by searching split substrings. 

With regards to claim 21, which depends on claim 20, Tang et al, Shanahan et al, 
Beeferman et al, and Birman et al teach the spell checking component generates a set 
of alternative spellings that are substrings in at least one selected from the group
consisting of at least one query log and at least one lexicon, as explained in claim 1,
and is rejected under the same rationale. 

With regards to claim 22, which depends on claim 21, Tang et al, Shanahan et al,
Beeferman et al, and Birman et al teach the set of alternative spellings comprising a set 
of alternative spellings, as explained in claim 1, and is rejected under the same 
rationale. Furthermore, Shanahan et al teaches the alternative spellings are determined 
via an iterative correction process (paragraph 0511: whereas, through an iterative
correction process, text/string/substring in a document gets replaced/corrected with
another substring as an alternative spelling. Furthermore, the iterative correction
process halts when the number of errors corrected at a previous iteration is less than
a threshold value, and thus the possible alternative spellings are less appropriate than 
the current set of alternative spellings). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al, Shanahan et al, Beeferman et al, and Birman et 
al's spell correction component, to have further implemented the generation of alternate 
spellings in an iterative correction process as also taught by Shanahan et al. The 
combination of Tang et al, Shanahan et al, Beeferman et al, Birman et al, and Hitachi 
would have allowed Tang et al's system to have repeatedly analyzed input text content 
until a satisfying correction level has been established. 
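
The halting condition described for the iterative correction process (stop once an iteration corrects fewer errors than a threshold) can be sketched as below. This is an assumed, simplified reading for illustration and not code from Shanahan et al; the function name iterative_correct, the one-step correction table, and the max_rounds safety limit are hypothetical.

```python
def iterative_correct(tokens, corrections, threshold=1, max_rounds=10):
    """Apply one-step corrections repeatedly.

    Halts when a round corrects fewer than `threshold` tokens, or after
    `max_rounds` rounds as a safety limit. Returns the tokens and the
    number of rounds that were run.
    """
    tokens = list(tokens)
    rounds = 0
    while rounds < max_rounds:
        fixed = 0
        for i, tok in enumerate(tokens):
            if tok in corrections:
                tokens[i] = corrections[tok]
                fixed += 1
        rounds += 1
        if fixed < threshold:
            break
    return tokens, rounds


# Hypothetical one-step corrections; "adress" needs two rounds to reach "address".
table = {"adress": "addres", "addres": "address", "teh": "the"}
result, rounds = iterative_correct(["adress", "teh"], table)
print(result, rounds)
```

With a threshold of 1, the loop runs one final round in which nothing changes and then stops, which mirrors the idea that no remaining alternative improves on the current set.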

With regards to claim 23, which depends on claim 22, Tang et al, Shanahan et al, 
Beeferman et al, Birman et al, and Hitachi teach the iterative correction process, 



Application/Control Number: 10/801,968 Page 14 

Art Unit: 2178 

comprising a plurality of iterations that change at least one substring to another substring
as an alternative spelling, the iterative correction process halts when all possible 
alternative spellings are less appropriate than a current set of alternative spellings, as 
explained in claim 22, and is rejected under the same rationale. 

With regards to claim 24, which depends on claim 23, Tang et al teaches 
alternative spellings, in claim 1, and is rejected under the same rationale. Tang et al 
also teaches the appropriateness of alternative spellings are computed based on a 
probabilistic string distance, as explained in claim 7, and is rejected under the same 
rationale. Tang et al however, does not teach the appropriateness of alternative 
spellings are computed based on a statistical context model. 

Shanahan et al teaches the appropriateness of alternative spellings are 
computed based on a statistical context model (paragraph 0243: whereas the context of 
the words surrounding a substring/entity is taken into account, and using ranking 
methods, only the highest ranked results are kept as appropriate for an alternative 
spelling). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al's spell correction system for providing alternative 
spellings, to further include the ability to determine appropriateness for alternative 
spellings based on not only string distance, but through context - statistical analysis as 
well. The combination of Tang et al, Shanahan et al, Beeferman et al, Birman et al, and 
Hitachi would have allowed Tang et al's system to have improved the accuracy of 
alternative spellings by taking the context of the input text into account when providing
alternative results. 

7. Claim 5 is rejected under 35 U.S.C. 103(a) as being unpatentable over Tang et al 
(US Patent: 6,636,849 B1, issued: Oct. 21, 2003, filed: Nov. 22, 2000), Shanahan et al 
(US Application: US 2003/0033288 A1, published: Feb. 13, 2003, filed: Dec. 5, 2001), 
and Beeferman et al (US Patent: 6,701,309 B1, issued: Mar. 2, 2004, filed: Apr. 21,
2000) and Birman et al (US Patent: 6,616,704 B1, issued: Sep. 9, 2003, filed: Sep. 20, 
2000), in further view of de Hita et al (US Patent: 6,081,774, issued: Jun. 27, 2000, filed: 
Aug. 22, 1997). 

With regards to claim 5, Tang et al and Shanahan et al, teach a list of stop 
words, as explained in claim 4, and is rejected under the same rationale. However, 
Tang et al and Shanahan et al do not teach the list of stop words containing high 
frequency words and function words and their frequent misspellings. 

de Hita et al teaches a list of stop/skip words containing high frequency words,
function words, and their frequent misspellings: whereas a stop/skip list is implemented 
for high frequency words and function words (column 1, lines 43-44), and frequent 
misspellings (column 2, lines 10-12). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al, and Shanahan et al's list of stop words to further 
include high frequency words, function words, and their frequent misspellings as taught 
by de Hita et al. The combination of Tang et al, Shanahan et al, Beeferman et al, Birman et
al, and de Hita et al would have allowed Tang et al's spell checking component to have
been able to normalize an input data/string set to focus on words that provide more
semantic content. 

8. Claim 16 is rejected under 35 U.S.C. 103(a) as being unpatentable over Tang et 
al (US Patent: 6,636,849 B1, issued: Oct. 21, 2003, filed: Nov. 22, 2000), Shanahan et
al (US Application: US 2003/0033288 A1, published: Feb. 13, 2003, filed: Dec. 5, 2001), 
Beeferman et al (US Patent: 6,701,309 B1, issued: Mar. 2, 2004, filed: Apr. 21, 2000), 
Birman et al (US Patent: 6,616,704 B1, issued: Sep. 9, 2003, filed: Sep. 20, 2000), and 
Hitachi (Derwent, published: Feb 16, 2001, Abstract), in further view of Herz et al (US
Patent: 5,754,939, issued: May 19, 1998, filed: Oct 31, 1995). 

With regards to claim 16, which depends on claim 15, Tang et al, Shanahan et al,
Beeferman et al, Birman et al, and Hitachi teach the substring bigram comprising a
pair of substrings in text, as explained in claim 15, and is rejected under the same
rationale. However, Tang et al, Shanahan et al, Beeferman et al, and Hitachi do not
expressly teach that the bigrams are adjacent substrings in a text.

Herz et al teaches the bigrams are adjacent substrings in a text (column 13, lines 
28-30: whereas, text is broken into bigrams, which are 2 adjacent words). 
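
For illustration, breaking text into bigrams of adjacent words and counting them might look like the following sketch. The function name adjacent_bigrams, the whitespace tokenization, and the sample text are assumptions and are not taken from Herz et al or Hitachi.

```python
from collections import Counter


def adjacent_bigrams(text):
    """Split text on whitespace and pair each word with the word that follows it."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


# Count how often each adjacent pair occurs in a hypothetical piece of text.
bigram_counts = Counter(adjacent_bigrams("new york city new york"))
print(bigram_counts[("new", "york")])
```

Each bigram here is a pair of neighbouring words only; non-adjacent pairs such as ("new", "city") are never produced.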

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al, Shanahan et al, Beeferman et al, and Hitachi's 
spell checking component to further include the ability to process substring bigrams that 
are adjacent in a text, as taught by Herz et al. The combination of Tang et al, Shanahan 
et al, Beeferman et al, Birman et al, Hitachi, and Herz et al would have allowed
Tang et al's spell checking component to have been able to process bigrams that are
contextually close to each other. 

9. Claims 19, 26-32, 34-36, and 41 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Tang et al (US Patent: 6,636,849 B1, issued: Oct. 21, 2003, filed:
Nov. 22, 2000), Shanahan et al (US Application: US 2003/0033288 A1, published: Feb.
13, 2003, filed: Dec. 5, 2001), Beeferman et al (US Patent: 6,701,309 B1, issued: Mar.
2, 2004, filed: Apr. 21, 2000), Birman et al (US Patent: 6,616,704 B1, issued: Sep. 9, 
2003, filed: Sep. 20, 2000), and Hitachi (Derwent, published: Feb 16, 2001, Abstract), in 
further view of Srihari et al (ACM, published: January 1983, pages 72-75). 

With regards to claim 19, which depends on claim 18, Tang et al teaches a tree 
data structure extracted from a lexicon (column 7, lines 21-23). However, Tang et al 
does not teach a data structure comprising a trie. 

Srihari et al teaches a data structure comprising a trie (Section 3, P3-1, Figure 2: 
whereas, a data structure used to represent a lexicon is a trie). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al's method to represent a lexicon in the form of
a trie, as taught by Srihari et al. The combination of Tang et al, Shanahan et al,
Beeferman et al, Birman et al, Hitachi, and Srihari et al, would have allowed Tang et al's 
spell checking component to have implemented a "data structure that is suitable for 
determining whether a given string is an initial substring" (Srihari et al, Section 3, P3-2). 
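
A trie of the general kind referred to above can be sketched as follows, showing how it supports the initial-substring test. This is a generic textbook structure written for illustration, not the specific organization of Srihari et al; the class name Trie, its method names, and the sample lexicon are hypothetical.

```python
class Trie:
    """Minimal character trie: each node maps a character to a child node."""

    def __init__(self):
        self.children = {}
        self.is_word = False  # True if a lexicon entry ends at this node

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def _walk(self, s):
        """Follow s from the root; return the node reached, or None if the path breaks."""
        node = self
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def is_initial_substring(self, s):
        """True if s is a prefix of at least one inserted word."""
        return self._walk(s) is not None

    def contains(self, s):
        """True if s itself was inserted as a word."""
        node = self._walk(s)
        return node is not None and node.is_word


lexicon = Trie()
for entry in ["spell", "spelling", "speed"]:
    lexicon.insert(entry)
print(lexicon.is_initial_substring("spe"), lexicon.contains("spe"))
```

Because words that share a prefix share a path from the root, the prefix test costs one step per character of the input string, regardless of lexicon size.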

With regards to claim 26, which depends on claim 24, Tang et al, Shanahan et al,
and Beeferman et al teach that a set of alternative spellings for a substring is generated,
as explained in claim 1, and is rejected under the same rationale. Tang et al also
teaches a searchable string data structure extracted from a trusted lexicon (column 7,
lines 22-24: whereas, a structured tree for a whole dictionary/lexicon is created for
searching). Furthermore, Beeferman et al teaches a searchable query log data structure
(Table 2, whereas, a flat data structure is used to store occurrence and co-occurrence 
query data). However, Tang et al, Shanahan et al, Beeferman et al, Birman et al, and 
Hitachi do not expressly teach a searchable substring data structure. 

Srihari et al teaches a searchable substring data structure (Page 72-73, Section 
3. Lexical Organization, Fig. 2: whereas, a trie is extracted from a lexicon, for which the 
trie is used to implement a searchable substring data structure using the Viterbi 
algorithm). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al, Shanahan et al, and Beeferman et al's spell 
correction system such that trie data structures storing substrings are extracted from a 
particular source (such as a lexicon or query log), such that the alternative spellings are 
generated from a trie using a Viterbi algorithm, as taught by Srihari et al. The 
combination of Tang et al, Shanahan et al, Beeferman et al, Birman et al, Hitachi, and 
Srihari et al would have allowed Tang et al's spell checking system to have used a data 
structure that is "efficient for text correction algorithms" (Srihari et al, page 72, Section 
3). 

With regards to claim 27, which depends on claim 26, Tang et al and Shanahan 
et al teach the processing of substrings from input text, in claim 1, and is rejected under 
the same rationale. Also Tang et al teaches the set of alternative strings for each string
query is restricted to within a probabilistic distance from an input string (column 10, lines 
32-42: whereas, alternative spellings for a string are based on several factors, including 
the probabilistic distance); the restriction is imposed within each iteration without limiting 
the iterative correction process as a whole (column 10, lines 48-60: whereas, the 
process is repeatedly extended to multiple search spaces or "grids"). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have used Tang et al, and Shanahan et al's method for processing 
substrings from input text, and additionally used Tang et al's method for iteratively 
processing the search space with a substring by applying probabilistic distance 
calculations. The combination of Tang et al, Shanahan et al, Beeferman et al, Birman et 
al, Hitachi, and Srihari et al would have allowed Tang et al's system to increase the 
speed and relevancy of possible alternative spellings for a given input substring. 

With regards to claim 28, which depends on claim 27, Tang et al, Shanahan et al, 
Beeferman et al, and Birman et al teach the iterative correction process, in claim 6,
and is rejected under the same rationale. Furthermore, Shanahan et al teaches an 
iterative correction process searches for an optimum set of alternative spellings via 
utilization of a statistical context model: whereas the context of the words surrounding a 
substring/entity is taken into account, and using ranking methods, only the highest 
ranked results are kept as appropriate for an alternative spelling (paragraph 0243) and 
the iterative correction process halts when the number of errors corrected at a 
previous iteration is less than a threshold value, and thus the possible alternative 
spellings are less appropriate than the current set of alternative spellings (paragraph 
0511: iterative process stops when the number of errors corrected is less than a 
threshold (optimal value)). Additionally, as explained in the rejection for claim 1, 
Beeferman et al also teaches a statistical context model comprising substring 
occurrence and co-occurrence statistics extracted from at least one query log (shown in 
Table 2 of Beeferman et al). 
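
For illustration only, an iterative correction loop that halts once the number of corrections made in an iteration falls below a threshold might be sketched as follows. This is not code from Shanahan et al; the function `iterative_correct`, the parameters `min_corrections` and `max_iterations`, and the toy context rule in `propose` are all hypothetical.

```python
def iterative_correct(tokens, propose, min_corrections=1, max_iterations=10):
    """Repeatedly apply propose() to every token until an iteration
    corrects fewer than min_corrections tokens.

    Returns the final token list and the number of iterations run.
    """
    current = list(tokens)
    iterations = 0
    for _ in range(max_iterations):
        iterations += 1
        # All proposals in one iteration are computed from the same state.
        proposed = [propose(i, current) for i in range(len(current))]
        corrected = sum(1 for old, new in zip(current, proposed) if old != new)
        current = proposed
        if corrected < min_corrections:
            break  # too few corrections: halt the process
    return current, iterations


def propose(i, tokens):
    """Toy context-sensitive proposer (hypothetical rules)."""
    word = tokens[i]
    if word == "britny":
        return "britney"
    # This fix only fires once the left neighbour has been corrected.
    if word == "speers" and i > 0 and tokens[i - 1] == "britney":
        return "spears"
    return word
```

With the toy rules above, the second token can only be corrected after the first one has been fixed in an earlier iteration, which is why more than one pass is needed before the loop halts.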

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have further modified Tang et al, Shanahan et al, Beeferman et al, and 
Birman et al's iterative correction system to further include the ability to use a statistical 
context model to determine an optimal set of alternative spellings, as taught by 
Shanahan et al. The combination of Tang et al, Shanahan et al, Beeferman et al, 
Birman et al, Hitachi, and Srihari et al, would have allowed Tang et al's spell checking 
component to have been able to iteratively go through a search space, and choose 
alternative spellings based on context of the input sentence/string/substring. 

With regards to claim 29, which depends on claim 28, Tang et al, Shanahan et al, 
Beeferman et al, Birman et al, and Hitachi teach the statistical context model, 
comprising substring occurrence and co-occurrence statistics extracted from at least 
one query log, as similarly explained in the rejection for claim 28, and is rejected under 
the same rationale. 
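
For illustration only, substring occurrence and co-occurrence statistics could be gathered from a query log into a single searchable structure along the following lines. This is a minimal sketch, not the method of Beeferman et al; it assumes whitespace-separated words and adjacent-word co-occurrence, and the name `build_stats` is hypothetical.

```python
from collections import Counter


def build_stats(query_log):
    """Count single words and adjacent word pairs in one Counter.

    Keys are tuples: (word,) for occurrence counts and (left, right)
    for co-occurrence counts, so both kinds share one lookup structure.
    """
    stats = Counter()
    for query in query_log:
        words = query.lower().split()
        for word in words:
            stats[(word,)] += 1
        for left, right in zip(words, words[1:]):
            stats[(left, right)] += 1
    return stats
```

A lookup such as `stats[("britney", "spears")]` then returns how often the two words appeared next to each other in the log, and an unseen key returns zero.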

With regards to claim 30, which depends on claim 29, Tang et al, Shanahan et al, 
Beeferman et al, Birman et al, Hitachi, and Srihari et al teach: 
• A Viterbi search is employed to facilitate in determining the optimum set of 
alternative spellings, as explained in claim 26, and is rejected under the same 
rationale. 

• Alternate spellings are determined according to the context model in each 
iteration, as explained in claim 28, and is rejected under the same rationale. 

With regards to claim 31, which depends on claim 30, Tang et al, Shanahan et al, 
Beeferman et al, Birman et al, Hitachi, and Srihari teach the Viterbi search, as similarly 
explained in the rejection for claim 30, and is rejected under the same rationale. 
However the combination of Tang et al, Shanahan et al, Beeferman et al, Birman et al, 
Hitachi, and Srihari as explained above, does not teach the Viterbi search can employ 
fringes to restrict a search for alternate spellings in an iteration such that for every pair 
of adjacent substrings, if any of the substrings is in at least one trusted lexicon, then 
only one of the substrings is allowed to change in that iteration. 

Yet, Srihari teaches that the Viterbi search can employ fringes to restrict a search for 
alternative spellings in an iteration such that for every pair of adjacent substrings, if 
any of the substrings is in at least one trusted lexicon, then only one of the substrings 
is allowed to change in that iteration (whereas, in a Viterbi search, each iteration 
corresponds to calculating a single survivor vector/path for a node/state/iteration 
(page 75). As shown in the for loop, in each iteration/state, for each letter/substring of 
index 'k' of a string of length m, there is an associated cost that is calculated for 
possible change using array 'A' in the iteration (page 77, see code 'procedure 
select(A,C,S,Z)')). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al, Shanahan et al, Beeferman et al, Birman et al, 
and Hitachi's context model, to further include Srihari et al's method for implementing a 
Viterbi context model, which employs fringes. The combination would have allowed 
Tang et al's spell checking component to have "considered all alternatives for each of 
the m letters" (Srihari et al, page 75). 
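
For illustration only, a Viterbi-style search over per-substring alternatives with the adjacent-pair restriction recited above might be sketched as follows. This is not the procedure of Srihari et al; it is one possible reading of the claim language, and the names `viterbi_correct`, `bigram_score`, and `trusted` are hypothetical. The sketch assumes every candidate list contains the original token, so that at least one path always survives.

```python
import math


def viterbi_correct(tokens, candidates, bigram_score, trusted):
    """Choose one candidate per token, maximising the summed pair scores.

    A transition is skipped when both adjacent tokens would change and
    at least one of the two original tokens is in the trusted lexicon.
    Assumes candidates[i] always includes tokens[i].
    """
    # best[c] = (score, path) of the best path ending in candidate c
    best = {c: (0.0, [c]) for c in candidates[0]}
    for i in range(1, len(tokens)):
        new_best = {}
        for c in candidates[i]:
            best_score, best_path = -math.inf, None
            for p, (score, path) in best.items():
                both_change = p != tokens[i - 1] and c != tokens[i]
                pair_trusted = tokens[i - 1] in trusted or tokens[i] in trusted
                if both_change and pair_trusted:
                    continue  # only one of the pair may change
                total = score + bigram_score(p, c)
                if total > best_score:
                    best_score, best_path = total, path + [c]
            if best_path is not None:
                new_best[c] = (best_score, best_path)  # survivor for c
        best = new_best
    return max(best.values(), key=lambda sp: sp[0])[1]


# Toy pair scores (hypothetical values).
SCORES = {("amazon", "prime"): 5.0, ("amazon", "prme"): 1.0}


def toy_score(left, right):
    return SCORES.get((left, right), 0.0)
```

In this sketch, when neither original token is trusted both may change at once; when one of them is trusted, only a single change per adjacent pair survives, and the remaining correction would be left to a later iteration.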

With regards to claim 32, Tang et al, Shanahan et al, Beeferman et al, and Birman et 
al similarly teach a method comprising: 

• Receiving input data containing text, as explained in claim 1, and is rejected 
under the same rationale. 

• Identifying a set of potentially misspelled substrings in the text, as explained in 
claim 1, and is rejected under the same rationale. 

• The log comprising data utilized by users to query a data collection over a time 
frame, as similarly explained in the rejection for claim 1, and is rejected under the 
same rationale. 

• That includes searching for an optimum set of alternative spellings via utilization 
of a statistical context model (Beeferman et al, Table 2, lines 48-64: whereas, 
using statistical context query log data, a set of alternative spellings are 
identified) 
• The statistical context model comprising substring occurrence and co-occurrence 
statistics extracted from at least one query log, as similarly explained 
in the rejection for claim 1, and is rejected under the same rationale. 

• Proposing at least one alternative spelling for the substring set, as explained in 
claim 1, and is rejected under the same rationale. 

Additionally, Tang et al, Shanahan et al, Beeferman et al, Birman et al, and Hitachi 
teach: 

• Generating a set of alternative spellings that are substrings in at least one 
selected from the group consisting of at least one query log and lexicon: as 
similarly explained in the rejection for claim 21, and is rejected under the same 
rationale. 

• The set of alternative spellings comprising a set of alternative spellings 
determined via an iterative correction process, as similarly explained in the 
rejection for claim 22, and is rejected under the same rationale. 

However, Tang et al, Shanahan et al, Beeferman et al, Birman et al, and Hitachi do not 
expressly teach employing a Viterbi search to facilitate in determining the optimum set 
of alternative spellings according to the context model in each iteration; the Viterbi 
search can employ fringes to restrict a search for alternative spellings in an iteration 
such that for every pair of adjacent substrings, if any of the substrings is in at least one 
trusted lexicon, then only one of the substrings is allowed to change in that iteration. 
Srihari et al teaches: 
• Employing a Viterbi search to facilitate in determining the optimum set of 
alternative spellings, as similarly explained in the rejection for claim 26, and is 
rejected under the same rationale. 

• ... according to the context model in each iteration. The Viterbi search can 
employ fringes to restrict a search for alternative spellings in an iteration such 
that for every pair of adjacent substrings, if any of the substrings is in at least one 
trusted lexicon, then only one of the substrings is allowed to change in that 
iteration (whereas, in a Viterbi search, each iteration corresponds to 
calculating a single survivor vector/path for a node/state/iteration (page 75). 
As shown in the for loop, in each iteration/state, for each letter/substring of index 
'k' of a string of length m, there is an associated cost that is calculated for 
possible change using array 'A' in the iteration (page 77, see code 'procedure 
select(A,C,S,Z)')). 

It would have been obvious to one of the ordinary skill in the art at the time of the 
invention to have modified Tang et al, Shanahan et al, Beeferman et al, Birman et al, 
and Hitachi's context model, to further include Srihari et al's method for implementing a 
Viterbi context model, which employs fringes. The combination would have allowed 
Tang et al's spell checking component to have "considered all alternatives for each of 
the m letters" (Srihari et al, page 75). 

With regards to claim 34, which depends on claim 32, Tang et al, Shanahan et al, 
Beeferman et al, and Birman et al similarly teach a method comprising: 
• Employing, at least in part, a list of stop words to facilitate in determining at least 
one alternative spelling, as explained in claim 4, and is rejected under the 
same rationale. 

• Utilizing substring occurrence and co-occurrence statistics from at least one 
query log, as explained in claim 1, and is rejected under the same rationale. 

• The query log comprising a histogram of queries asked over a time frame, as 
explained in claim 9, and is rejected under the same rationale. 

• The substring occurrence and co-occurrence statistics from the query log are 
stored in a same searchable data structure, as explained in claim 1, and is 
rejected under the same rationale. 

Additionally, Tang et al, Shanahan et al, Beeferman et al, Birman et al, and Hitachi 
teach: 

• Handling split substrings in the same manner as handling individual substrings, 
as explained in claim 20, and is rejected under the same rationale. 

With regards to claim 35, which depends on claim 34, Tang et al, Shanahan et al, 
Beeferman et al, Birman et al, and Hitachi similarly teach a method comprising: 

• Changing at least one substring to another substring as an alternative spelling, 
as explained in claim 23, and is rejected under the same rationale. 

• Halting the iterative correction process when all possible alternative spellings are 
less appropriate than a current set of alternative spellings, as explained in claim 
23, and is rejected under the same rationale. 
• The alternative spellings and their appropriateness are computed based on a 
probabilistic string distance and a statistical context model, as explained in claim 
24, and is rejected under the same rationale. 

With regards to claim 36, which depends on claim 35, Tang et al, Shanahan et al, 
Beeferman et al, Birman et al, Hitachi, and Srihari et al similarly teach a method 
comprising: 

• Utilizing a searchable data structure extracted from at least one query log and at 
least one trusted lexicon to generate the set of alternative spellings for a 
substring, as explained in claim 26, and is rejected under the same rationale. 

• Restricting the set of alternative spellings for each substring to within a 
probabilistic distance from an input substring, the restriction being imposed within 
each iteration without limiting the iterative correction process as a whole, as 
explained in claim 27, and is rejected under the same rationale. 

With regards to claim 41, Tang et al, Shanahan et al, Beeferman et al, Birman et al, 
and Hitachi similarly teach a device employing the method of claim 32, comprising at 
least one of a computer, a server, and a handheld device, as explained in claim 32, and 
is rejected under the same rationale. 

10. Claim 25 is rejected under 35 U.S.C. 103(a) as being unpatentable over Tang et 
al (US Patent: 6,636,849 B1, issued: Oct. 21, 2003, filed: Nov. 22, 2000), Shanahan et 
al (US Application: US 2003/0033288 A1, published: Feb. 13, 2003, filed: Dec. 5, 2001), 
Beeferman et al (US Patent: 6,701,309 B1, issued: Mar. 2, 2004, filed: Apr. 21, 2000), 
Birman et al (US Patent: 6,616,704 B1, issued: Sep. 9, 2003, filed: Sep. 20, 2000), and 
Hitachi (Derwent, published: Feb 16, 2001, Abstract), in further view of Brill et al 
(Microsoft Research: 'An Improved Error Model for Noisy Channel Spelling Correction', 
published: 2000, 8 pages). 

With regards to claim 25, Tang et al, Shanahan et al, Beeferman et al, Birman et al, and 
Hitachi teach the probabilistic string distance, as similarly explained in the rejection for 
claim 24, and is rejected under the same rationale. 

However, Tang et al, Shanahan et al, Beeferman et al, Birman et al, and Hitachi do not 
expressly teach that the probabilistic string distance comprises a modified 
context-dependent weighted Damerau-Levenshtein edit function that allows insertion, 
deletion, substitution, transposition, and long-distance movement of characters as point 
changes (Brill et al, Page 2, First paragraph of section 2 'An Improved Error Model': 
whereas, under the Damerau-Levenshtein distance measure, the distance between two 
strings is the minimum number of single character insertions, substitutions, deletions, 
and transpositions. Thus, since the Damerau-Levenshtein distance does not require the 
edit distance to be only one, but instead allows the edit distance to be variable (but 
minimum), then long distance movement of characters is allowable). 
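
For illustration only, the unweighted form of the distance described above (insertions, deletions, substitutions, and transpositions of adjacent characters) can be sketched as follows. This is the restricted variant commonly called optimal string alignment, with every edit costing one; it does not implement the context-dependent weights or the long-distance movement recited in the claim, and the name `osa_distance` is hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Every insertion, deletion, substitution, and adjacent transposition
    costs 1. Uniform costs are an assumption of this sketch.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Adjacent transposition, e.g. "eh" <-> "he".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

Under this sketch, "teh" and "the" are one edit apart because the swap of adjacent characters is counted as a single change rather than two substitutions.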

Response to Arguments 

11. Applicant's arguments with respect to claims 1-12, 14-16, 18-32, 34-36, 38, 41, 
and 42 have been considered but are moot in view of the new ground(s) of rejection. 

Conclusion 

12. Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Wilson Tsui whose telephone number is (571)272-7596. 
The examiner can normally be reached on Monday - Friday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Stephen Hong can be reached on (571) 272-4124. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



Wilson Tsui 
Patent Examiner 
Art Unit: 2178 
March 16, 2007 